Abstract

[Purpose] To convert music into images or visual art having reality. 

[Constitution] In the device having a motion database 107 for storing the motion of an articulated 
object, a motion generating section 106, on the basis of the chord detected by a chord detecting section 
105, searches motion of the articulated object from the motion database 107 and generates motion data 
when the volume buildup point is detected by a peak detecting section 103. 

[Effect] It is possible to generate, from music, images having definite shape and motion, such as the
motion of a human dancing to the music. 
Specification

[Detailed Description of the Invention] 
[0001] 

[Field of the Invention] 

The present invention relates to a musical image conversion device for converting musical 
information into images. 
[0002] 
[Prior Art] 

As an example of a prior art of converting sounds into images, a technique is disclosed in which 
the motion of sounds in each part of a music is made to correspond to the motion of images having a 
basic shape (JP-A-63-184875). 
[0003]

[Problems to be Solved by the Invention] 

However, the prior art above has a drawback: since the motions of sounds are made to
correspond directly to the motions of images, the images obtained are abstract and lack coherent
motion as a whole, so the visual art intended by the related art ends up being abstract art.
[0004] 

The object of the present invention is to provide a musical image conversion device for
converting music, or acoustic art, into images, or visual art, having reality.
[0005] 

[Means for Solving the Problem] 

To achieve the above object, the present invention comprises: a motion database for storing the 
motion of an articulated object, a peak detecting section for generating timing to switch the motion 
from musical signals, a chord detecting section for detecting chords from musical signals, and a 
motion generating section for selecting motions from the motion database based on the chords in time 
to the timing to switch the motion. 
[0006] 
[Function] 

A certain number of motions made within a certain period of time by an articulated object such 
as a human are stored in advance in a database. With a sound buildup point as a cue, a motion is 
searched according to the chord constitution of that part. Musical information is converted into a
series of pieces of motion information by appropriately switching the motion at every point of sound 
buildup. Motions of the articulated object are applied to the image of the articulated object to display 
an animated image. 
[0007] 

[Embodiment] 
(First embodiment) 

An embodiment of the present invention will be described below in reference to appended 
drawings. FIG. 1 shows the constitution of an embodiment of the present invention. 
[0008] 

First, music is inputted as electric signals of voice through a musical signal inputting section 101.
A volume detecting section 102 extracts sound volume information from the musical information inputted
through the musical signal inputting section 101. In the case the musical information inputted is
electric signals of voice, the square of the signal value represents the volume, so the envelope of the
squared signal can be obtained as the volume information signal. A peak detecting section 103 detects a
volume buildup point from the volume information signal obtained with the volume detecting section 102.
By detecting only buildups that exceed the average volume over a certain immediately preceding period
(about 2 or 3 seconds), detection of small sound buildups can be prevented.
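
The volume detection and peak detection described above can be sketched as follows. This is only an illustrative sketch, not the implementation of the embodiment: the function names, the use of non-overlapping windows to approximate the envelope, and the parameters `window` and `history` are assumptions introduced here.

```python
def volume_envelope(samples, window):
    """Approximate the volume envelope as the mean of the squared samples
    over consecutive non-overlapping windows (window size is an assumption)."""
    envelope = []
    for start in range(0, len(samples) - window + 1, window):
        block = samples[start:start + window]
        envelope.append(sum(x * x for x in block) / window)
    return envelope


def detect_buildups(envelope, history):
    """Return the envelope indices at which the volume rises and also exceeds
    the average of the preceding `history` envelope values.

    `history` stands in for the "about 2 or 3 seconds" of the description;
    its value depends on the (assumed) window length.
    """
    buildups = []
    for i in range(1, len(envelope)):
        past = envelope[max(0, i - history):i]
        average = sum(past) / len(past)
        rising = envelope[i] > envelope[i - 1]
        if rising and envelope[i] > average:
            buildups.append(i)
    return buildups


if __name__ == "__main__":
    env = [1.0, 1.0, 1.0, 0.2, 0.5, 0.2, 3.0]
    # The small rise at index 4 stays below the recent average and is ignored.
    print(detect_buildups(env, history=3))
```

Comparing against the recent average rather than a fixed threshold lets the same rule work for quiet and loud passages.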
[0009] 

A tone interval detecting section 104 detects tone interval information from the musical
information inputted through the musical signal inputting section 101. Electric signals of voice are
subjected to frequency analysis using Fourier transform. Tone intervals used in music are the values
expressed with the equation 1:

[0010] 

(Equation 1) 

f(n, m) = 440 \times 2^{n/12} \times 2^{m} (Hz)
where,
n: 0, ..., 11
m: integer
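
As a brief illustration of Equation 1, the frequency of a tone is 440 Hz scaled by 2 to the power n/12 for the half-tone step and by 2 to the power m for the octave. The function name `pitch_frequency` is an assumption introduced here.

```python
def pitch_frequency(n, m):
    """Frequency in Hz of the tone n half tones (0..11) and m octaves
    away from the basic tone la (440 Hz), following Equation 1."""
    return 440.0 * 2.0 ** (n / 12.0) * 2.0 ** m


if __name__ == "__main__":
    print(pitch_frequency(0, 0))   # the basic tone la
    print(pitch_frequency(3, 0))   # do, three half tones above la
    print(pitch_frequency(0, 1))   # la one octave higher
```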

[0011] 

The basic tone la of 440 Hz is f(0, 0). The letters n and m denote the intervals by a half tone 
and an octave, respectively. Therefore, the frequency component I(n, m) of the frequency f(n, m) 
expressed with the equation 1 is obtained with the tone interval detecting section 104. A chord 
detecting section 105 detects a chord according to the tone interval information obtained with the tone 
interval detecting section 104 at the volume buildup point obtained with the peak detecting section 103. 
First, a tone that is displaced by exactly an octave is deemed to be the same tone and their components 
I(n) are calculated with the equation 2: 
[0012] 
(Equation 2) 

I(n) = \sum_{m} I(n, m)

[0013] 

A certain number of n's are taken out in the order of decreasing magnitude of component I(n) 
and combinations of the n's are detected as chords. 
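
The octave folding of Equation 2 and the selection of the strongest tones might be sketched as below. The dictionary layout for the components I(n, m), the function names, and the default of three tones are assumptions made for illustration.

```python
def fold_octaves(components):
    """Sum the components I(n, m) over all octaves m to obtain I(n).

    `components` is assumed to map (n, m) pairs to magnitudes, with n in 0..11.
    """
    folded = [0.0] * 12
    for (n, _m), magnitude in components.items():
        folded[n] += magnitude
    return folded


def strongest_tones(folded, count=3):
    """Pick the `count` values of n with the largest I(n), in ascending order."""
    ranked = sorted(range(12), key=lambda n: folded[n], reverse=True)
    return sorted(ranked[:count])


if __name__ == "__main__":
    components = {(3, 0): 1.0, (3, 1): 0.5, (7, 0): 1.2, (10, -1): 0.9, (0, 0): 0.1}
    print(strongest_tones(fold_octaves(components)))
```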
[0014] 

A motion generating section 106, on the basis of the information on the volume buildup point
detected with the peak detecting section 103 and the chord detected with the chord detecting section
105, searches the motion database 107 for motion of the articulated object and synthesizes it. FIG. 4
shows an example in which a human is deemed to be an articulated object. Motions of limbs of the
articulated object shown in FIG. 4 are represented with a series of numerical values of position
coordinates and angles of joints, and stored in advance in the motion database 107. A volume buildup
point detected with the peak detecting section 103 is made a motion changing point. The kinds of
chords detected with the chord detecting section 105 are matched in advance with the motion data
stored in the motion database 107, and the motion occurring after the motion changing point is
searched by the kind of chords. Joint
data are made to change smoothly from the preceding motion by applying a spline processing or the
like. A motion output section 108 visualizes and displays the motion data generated with the motion
generating section 106.
[0015] 

As a result, an articulated object moving in time to the music may be displayed using the
pre-stored motion data.
[0016] 

(Second embodiment) 
Next, another embodiment of the motion generating section 106 according to the present
invention is described in reference to FIG. 2. A chord classifying section 201 classifies into kinds the 
chords detected with the chord detecting section 105. 
[0017] 

An example of determining the kinds of chords with the chord classifying section 201 is 
described in reference to the flowchart of FIG. 5. Here, the three n's (n1, n2, and n3) whose chord
components are greatest are selected, assuming n1 < n2 < n3. In the case the values of (n2-n1) and
(n3-n2) prove to be 4 and 3 respectively as a result of checking the values, the chord is of the major
key. In the case they prove to be 3 and 4, the chord is of the minor key. For example, in the case a
chord of do, mi and sol is detected, since n1, n2, and n3 are 3, 7, and 10, respectively, the chord is
determined to be the major key. In the case of la, do, and mi, since n1, n2, and n3 are 0, 3 and 7, the
chord is determined to be the minor key. Some other determinations may be made by the two
differential values above.
[0018] 

In the case the determination cannot be made, the chord is rotated by taking the n2, n3 and n1
+ 12 used so far as the new n1, n2, and n3, and determination is made. For example, if a chord of la,
do and fa is present, the n1, n2 and n3 are 0, 3 and 8, respectively, so it cannot be determined to be
either major or minor key. So determination is made by rotating the chord with new n1, n2 and n3 of
3, 8 and 12, respectively. If the determination cannot be made, one more rotation is made and
determination is attempted again. If the determination fails three times, the chord is determined to be
'unidentified' and the process is finished.
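
The classification procedure with rotation may be sketched as follows. This is a minimal sketch that handles only the major and minor determinations named in the description. The function name `classify_chord` and the returned strings are assumptions.

```python
def classify_chord(n1, n2, n3):
    """Classify a three-tone chord given as half-tone indices n1 < n2 < n3.

    The two differences (n2 - n1, n3 - n2) are checked. If neither pattern
    matches, the chord is rotated (n2, n3, n1 + 12) and checked again,
    for at most three determinations in total.
    """
    notes = [n1, n2, n3]
    for _ in range(3):
        lower = notes[1] - notes[0]
        upper = notes[2] - notes[1]
        if (lower, upper) == (4, 3):
            return "major"
        if (lower, upper) == (3, 4):
            return "minor"
        # Rotate: the lowest tone is moved up by one octave.
        notes = [notes[1], notes[2], notes[0] + 12]
    return "unidentified"


if __name__ == "__main__":
    print(classify_chord(3, 7, 10))  # do, mi, sol
    print(classify_chord(0, 3, 7))   # la, do, mi
    print(classify_chord(0, 3, 8))   # la, do, fa: matches after two rotations
```

For la, do and fa, the first rotation gives 3, 8 and 12 (differences 5 and 4), and the second gives 8, 12 and 15 (differences 4 and 3), so this sketch reports a major chord on the third determination.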
[0019] 

In FIG. 2, a motion searching section 202 searches motion data corresponding to the chord 
classification results obtained with the chord classifying section 201 from the motion database 107. 
A motion synthesizing section 203 smooths, by the spline processing or the like, the transition between
the motion data found with the motion searching section 202 and the motion data used so far, and
synthesizes them so that the motions are consistent. As a result, the motion of the articulated object
may be varied according to the atmosphere conveyed by the musical chords.
[0020] 

(Third embodiment) 

Next, still another embodiment of the motion generating section 106 according to the present 
invention is described in reference to FIG. 3. A bar recognizing section 301 determines a motion 
change point based on the volume buildup point detected with the peak detecting section 103. 
[0021] 

An operation sample of the bar recognizing section 301 is described in reference to FIG. 6. A 
bar is defined as the shortest unit time for performing one motion. The graph shows a volume buildup 
time point detected with the peak detecting section and the volume at that time point. The volume is 
indicated along the vertical axis of the graph while time is along the horizontal axis. It is assumed 
that the inputted musical information represents a rhythm that is distinct to some extent and the 
volume buildup point is approximately one of time points of 0, 1a, 2a, ... . In the drawing, 'bar time'
refers to a period of time for performing one motion, 'elapsed time within a bar' refers to the time
elapsed within a bar, and 'bar head time' refers to the time point at the head of a bar.
[0022] 

It is assumed that the volume first builds up at the time point 0. A motion starts from that point. 
The next buildup time point is assumed to be 'a.' When the volume at the time point 'a' is greater
than that at the time point 0, the time point 'a' is defined as the reference time point; otherwise the
time point 0 is defined as the reference time point. And the period of time 'a' is defined as the unit
time. As the unit time is determined at the time point 'a,' the bar time at that time point is 'a.' 
When the time point 'a' is assumed to be the reference time point, the time elapsed within the bar is 0, 
and the bar head time is 'a.' 
[0023] 

At the time point '2a,' there is no input and so it is deemed to be in the middle of the bar.
Therefore, the bar time is 2a, the time elapsed within the bar is 'a,' and the bar head time remains
unchanged, 'a.'
[0024] 

At the time point '3a,' although there is an input, since the input is rather small, it is hard to
determine whether it denotes the head or intermediate point of the bar. Therefore, that time point is 
determined to be 'at the head or in the middle' of the bar. The bar time is determined to be 2a or 3a, 
and the time elapsed within the bar is determined to be 0 (the bar has changed) or 2a (the bar has not 
changed). Since input is small also at the time point 4a, the state at the time point 3a is carried over. 
[0025] 

When a great input is given at the time point 5a, that point may be deemed to be the head of a bar, 
and the inputs at the time points 3a and 4a are determined to have been given in the middle of the bar. 
Therefore, the bar time may be deemed to be 4a, and the time elapsed within the bar is deemed to be
5a.

[0026] 

Thereafter, in the case a great input is given every time a period of 4a elapses, it is deemed there 
is no contradiction, and the bar time 4a is maintained. The sound volume is judged according to a 
reference which is the average of the sound volume during a preceding period of a few seconds. The 
bar recognizing section 301 outputs the bar head time. The unit time 'a' is dynamically set according 
to the volume buildup points of several preceding bars. 
[0027] 

The motion synthesizing section 203, according to the time span between motion change points 
determined with the bar recognizing section 301, determines at what speed the motions are to be 
synthesized. For example, in case a motion reproduction time T is 4a, motion data must be 
interpolated and corrected so that one motion is finished over the time T=4a. When it is assumed that 
the motion data stored in the motion database 107 are for M frames per one motion and that the display 
system requires motion data of D frames per second, in order to display the d-th frame on the display 
system, the motion data of the m-th frame should be used as shown in (Equation 3). 
[0028] 
(Equation 3) 
m = \frac{M d}{T D}

[0029] 

Since m is generally not an integer, motion data for display are obtained by interpolating between the
motion data of the (integer part of m)-th frame and the motion data of the (integer part of m plus 1)-th frame.
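
The frame mapping of Equation 3 and the interpolation between neighboring frames can be illustrated as below. Linear interpolation, zero-based frame indices, the representation of a frame as a list of joint values, and the function names are all assumptions made for this sketch. The description itself does not fix the interpolation method.

```python
def source_frame_index(d, M, T, D):
    """Equation 3: stored-frame position m for the d-th displayed frame,
    with M stored frames per motion, reproduction time T (seconds) and
    a display rate of D frames per second."""
    return M * d / (T * D)


def interpolate_pose(frames, m):
    """Linearly interpolate between frame int(m) and frame int(m) + 1.

    `frames` is assumed to be a list of equal-length lists of joint values.
    Indices past the last stored frame are clamped to the last frame.
    """
    last = len(frames) - 1
    base = min(int(m), last)
    nxt = min(base + 1, last)
    frac = m - int(m)
    return [a + (b - a) * frac for a, b in zip(frames[base], frames[nxt])]


if __name__ == "__main__":
    frames = [[0.0], [10.0], [20.0], [30.0]]      # M = 4 stored frames
    m = source_frame_index(d=3, M=4, T=2.0, D=4)  # 8 display frames in total
    print(m, interpolate_pose(frames, m))
```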
[0030] 

As a result, it is possible to roughly set the motion change point according to the speed of the 
piece of music and to generate motions according to the music. 
[0031] 

(Fourth embodiment) 

Next, an embodiment of the motion output section 108 according to the invention is described. 
The motion output section 108 displays the motion data outputted with the motion generating section 
106 by applying them to the three-dimensional shape model of an articulated object generated by 
computer graphics. A three-dimensional shape model of a human shape as shown in FIG. 4 for 
example is used as the articulated object, and joint position coordinates and joint angles of the motion 
data are applied to the three-dimensional shape model. 
[0032] 

As a result, the manner of motion, in time to the music, of an articulated object such as a human 
having a definite shape may be visualized. 
[0033] 

(Fifth embodiment) 

Next, the fifth embodiment of the invention is described. As music information, signals 
corresponding to performance control information, such as MIDI control signals are inputted to the 
musical signal inputting section 101. The MIDI control signals have interval information and sound 
volume information for individual tones. The volume detecting section 102 may take out volume 
information as a piece of music by calculating the sum of sound volume information of tones present at 
timing points for which tone producing information is present. Since it is easy to take out interval 
information with the tone interval detecting section 104, thereafter it is possible to carry out a similar 
process to that when electric signals of voice are inputted as musical information. 
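
For MIDI-style input, the volume and interval extraction might look like the following sketch. The tuple layout `(time, note_number, velocity)` for note events and the function names are assumptions. The mapping of note number 69 to the 440 Hz la follows the common MIDI tuning convention and is not stated in the description.

```python
def volume_from_notes(note_events):
    """Sum the volumes (velocities) of all tones starting at each time point.

    `note_events` is assumed to be a list of (time, note_number, velocity)
    tuples for tone-producing events only.
    """
    totals = {}
    for time, _note, velocity in note_events:
        totals[time] = totals.get(time, 0) + velocity
    return sorted(totals.items())


def interval_from_midi(note_number):
    """Convert a MIDI note number to (n, m) as used in Equation 1,
    assuming note 69 is the basic tone la (440 Hz)."""
    offset = note_number - 69
    return offset % 12, offset // 12


if __name__ == "__main__":
    events = [(0.0, 60, 80), (0.0, 64, 70), (0.5, 67, 90)]
    print(volume_from_notes(events))
    print([interval_from_midi(note) for _t, note, _v in events])
```

Because the interval is read directly from the note number, no frequency analysis is needed for this kind of input.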
[0034] 

Thus, the load of the interval detection and volume detection processes is alleviated.
[0035] 

[Effect of the Invention] 

According to the invention as described above, it is possible to convert the acoustic art, the 
music, into specific visual art, the motion of an articulated object having a definite shape, for example the
motion of a human dancing to the music. 

[Brief Description of Drawings] 

FIG. 1 is the block diagram of the constitution of an embodiment of the present invention. 

FIG. 2 is the block diagram of an example constitution of the motion generating section of the 
embodiment. 
FIG. 3 is the block diagram of another example constitution of the motion generating section of
the embodiment. 

FIG. 4 is a line drawing of a human as an example of the articulated object of the embodiment.
FIG. 5 is a flowchart of the chord classification procedure of the embodiment. 

FIG. 6 shows the function of the bar recognizing section which determines motion change points 
using volume data in time sequence. 

[Description of Reference Numerals] 
101: Musical signal inputting section
102: Volume detecting section 
103: Peak detecting section 
104: Tone interval detecting section 
105: Chord detecting section 
106: Motion generating section 
107: Motion database 
108: Motion output section 
201: Chord classifying section 
202: Motion searching section
203: Motion synthesizing section 
301: Bar recognizing section 